Improved Speech Synthesis Using Fuzzy Methods

Authors

  • Doina Jitca
  • Horia-Nicolai L. Teodorescu
  • Vasile Apopei
  • Florin Grigoras
Abstract

The paper presents theoretical support for, and describes the use of, a fuzzy paradigm in implementing a TTS system for the Romanian language that employs a rule-based formant synthesizer. Within the framework of classic TTS systems, we propose a new approach to improve formant trace computation, with the aim of increasing the perceptual quality of synthetic speech. A fuzzy system is proposed to solve the problem of phonemes that are prone to multiple definitions in rule-based speech synthesis. In the introductory section, we briefly present the background of the problem and our previous results in speech synthesis. In the second section, we deal with the problem of context-dependent phonemes at the level of the letter-to-sound module of our TTS system. We then discuss the case of the phoneme /l/ and the solution adopted to define it for different contexts. A fuzzy system is associated with each parameter (denoted F1 and F2) to implement the results of the complete analysis of the behavior of the phoneme /l/. The knowledge used in implementing the fuzzy module is acquired by natural speech analysis. In the third section, we exemplify the computation of the synthesis parameters F1 and F2 of the phoneme /l/ in the context of two syllable sequences. The parameter values are contrasted with those obtained from spectrogram analysis of the natural speech sequences. The last section presents the main conclusions and further research objectives.
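
As a rough illustration of the idea summarized in the abstract, the sketch below shows how a small fuzzy system could map a phonetic context to a formant target for /l/. It is not the authors' system: the choice of input (the F2 of the neighbouring vowel), the fuzzy set names, the membership breakpoints, the target values, and the function names `triangular` and `l_f2_target` are all hypothetical placeholders chosen for readability, and the inference scheme (a zero-order Sugeno weighted average) is an assumption. In the paper, the corresponding knowledge is acquired from natural speech analysis.

```python
def triangular(x, left, peak, right):
    """Degree of membership of x in a triangular fuzzy set (0.0 to 1.0)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# HYPOTHETICAL fuzzy sets over the F2 (in Hz) of the neighbouring vowel.
# Breakpoints are illustrative only, not measured values.
CONTEXT_SETS = {
    "back": (100.0, 800.0, 1500.0),
    "central": (800.0, 1500.0, 2200.0),
    "front": (1500.0, 2200.0, 2900.0),
}

# HYPOTHETICAL rule consequents: one F2 target (in Hz) for /l/ per context set.
L_F2_TARGETS = {"back": 1000.0, "central": 1300.0, "front": 1600.0}


def l_f2_target(vowel_f2):
    """Blend the per-context targets, weighted by rule activation.

    Each rule reads "IF the neighbouring vowel is <set> THEN F2 of /l/ is
    <target>"; the output is the activation-weighted average of the targets.
    """
    weights = {
        name: triangular(vowel_f2, *params)
        for name, params in CONTEXT_SETS.items()
    }
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("vowel_f2 lies outside every fuzzy set")
    return sum(weights[n] * L_F2_TARGETS[n] for n in weights) / total


if __name__ == "__main__":
    for f2 in (800.0, 1150.0, 1500.0):
        print(f2, l_f2_target(f2))
```

A second system of the same shape, with its own sets and targets, would be needed for F1, since the abstract associates one fuzzy system with each parameter. The weighted average makes the target vary smoothly between contexts rather than switching abruptly at a rule boundary.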

Similar resources

Adaptive Prosody Modelling for Improved Synthetic Speech Quality

Neural networks and fuzzy logic have proven to be efficient when applied individually to a variety of domain-specific problems, but their precision is enhanced when hybridized. This contribution presents a combined framework for improving the accuracy of prosodic models. It adopts the Adaptive Neuro-fuzzy Inference System (ANFIS) to offer self-tuned cognitive-learning capabilities, suitable for...

A nonlinear unit selection strategy for concatenative speech synthesis based on syllable level features

This paper describes an improved algorithm, motivated by fuzzy logic theory, for the selection of speech segments for concatenative synthesis from a huge database. Triphone HMM clustering is employed as an adaptive measure for articulatory similarity within a given database. Stress level contours are evaluated in the context of their surrounding vocalic peaks. The algorithm uses a beam search t...

Soft-computing Methods for Text-to-Speech Driven Avatars

This paper presents a new approach for driving avatars with text-to-speech synthesis that uses pure text as an information source. The goal is to move lips and face muscles on the basis of the phonetic nature of the utterance and the related expression. Several methods came together to define this solution. Rule-based text-to-speech synthesis generates phonetic and expression transcription of t...

Using Fuzzy Sets to Model Paralinguistic Content in Speech as a Generic Solution for Current Problems in Speech Recognition and Speech Synthesis

Current problems in speech processing exist due to infinite variations of speech utterances. No two speech utterances are exactly alike, even if they are linguistically the same word. The difference is therefore due to the paralinguistic content of the speech utterances. This leads to the conceptualization of the paralinguistic content of speech as arising from infinite variation. Infinite var...

Recognition of speech commands using a modified neural fuzzy network and an improved GA

This paper presents the recognition of speech commands using a modified neural-fuzzy network. To train the parameters of the network, an improved genetic algorithm is proposed. As an application example, the proposed speech recognition approach is implemented experimentally in an Electronic Book to illustrate the design and its merits.

Journal title:
  • I. J. Speech Technology

Volume 5, Issue

Pages -

Publication date: 2002